On Utilizing Optimal and Information Theoretic Syntactic Modeling for Peptide Classification

Authors

  • Eser Aygün
  • B. John Oommen
  • Zehra Cataltepe
Abstract

Syntactic methods in pattern recognition have been used extensively in bioinformatics, in particular in the analysis of gene and protein expression and in the recognition and classification of biosequences. These methods are almost universally distance-based. This paper concerns the use of an Optimal and Information Theoretic (OIT) probabilistic model [11] to classify peptides using the information residing in their syntactic representations. Such classification has traditionally relied on the edit distances computed when the peptides are compared. We argue that the differences between compared strings can be modeled as a mutation process consisting of random Substitutions, Insertions and Deletions (SID) that obeys the OIT model. In this paper, we show that the probability measure obtained from the OIT model can serve as a sequence similarity metric, from which a Support Vector Machine (SVM)-based peptide classifier, referred to as OIT SVM, can be devised. The classifier has been tested with eight different “substitution” matrices and on two data sets, namely the HIV-1 Protease Cleavage sites and the T-cell Epitopes. The results show that the OIT model performs significantly better than both a classifier based on the Needleman-Wunsch sequence alignment score and the peptide classification methods previously evaluated on the same two data sets.
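For orientation, the Needleman-Wunsch baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `match`, `mismatch` and `gap` values are toy assumptions standing in for the substitution matrices used in the paper, the peptide strings are made up, and the helper names are hypothetical. It does not implement the OIT probability measure itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of strings a and b with a linear gap penalty.

    The scoring values are illustrative assumptions, not the paper's
    substitution matrices.
    """
    # prev[j] holds the best score for aligning the current prefix of a
    # with the first j symbols of b; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against the empty string
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # substitution or match
                            prev[j] + gap,       # deletion from a
                            curr[j - 1] + gap))  # insertion into a
        prev = curr
    return prev[-1]


def similarity_matrix(peptides):
    """Pairwise alignment scores for a list of peptide strings."""
    return [[needleman_wunsch_score(p, q) for q in peptides] for p in peptides]


# Hypothetical peptides, used only to show the call pattern.
example = ["SQNYPIVQ", "SQNYPIV", "ARVLAEAM"]
scores = similarity_matrix(example)
```

A matrix of this kind is what a similarity-based classifier would consume; in the approach described above, the alignment score would be replaced by the probability obtained from the OIT model.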


Similar resources

Peptide classification using optimal and information theoretic syntactic modeling

We consider the problem of classifying peptides using the information residing in their syntactic representations. This problem, which has been studied for more than a decade, has typically been investigated using distance-based metrics that involve the edit operations required in the peptide comparisons. In this paper, we shall demonstrate that the Optimal and Information Theoretic (OIT) model...


Data mining for decision making in engineering optimal design

Often, when modeling engineering design optimization problems, the value of the objective function(s) is not explicitly defined in terms of the design variables. Instead, it is obtained by numerical analysis such as FE structural analysis, fluid mechanics analysis, or thermodynamic analysis. Yet these numerical analyses are considerably time-consuming when obtaining the final value of objective functi...


Modeling gene regulatory networks: Classical models, optimal perturbation for identification of network

A deep understanding of molecular biology has allowed the emergence of new technologies such as DNA decryption. On the other hand, advances in molecular biology have made the manipulation of genetic systems simpler than ever; this promises extraordinary progress in biological, medical and biotechnological applications. This is not an unrealistic goal since genes which are regulated by gene regulatory ...


Coordinating a decentralized supply chain with a stochastic demand using quantity flexibility contract: a game-theoretic approach

A supply chain includes two or more parties linked by flows of goods, information, and funds. In a decentralized system, supply chain members make decisions regardless of the effects of those decisions on the performance of the other members and of the entire supply chain. The key issue in supply chain management is therefore to develop a mechanism in which different objectives should be align...


Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata-based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...




Publication date: 2009